Solution of the 2-star model of a network

The p-star model or exponential random graph is among the oldest and best-known of network
models. Here we give an analytic solution for the particular case of the 2-star model, which is 
one of the most fundamental of exponential random graphs. We derive expressions for a number 
of quantities of interest in the model and show that the degenerate region of the parameter space 
observed in computer simulations is a spontaneously symmetry broken phase separated from the 
normal phase of the model by a conventional continuous phase transition.

I. INTRODUCTION

There has in recent years been a surge of interest within
the physics community in the properties of networks,
including the Internet, the world wide web, and social and
biological networks of various kinds. Work has been
divided between studies of specific real-world
networks, along with the development of measures and
algorithms for their analysis, and the creation of models
to predict and explain network behavior. It is on models
that we focus here.

Network modeling goes back at least as far as the
well-known random graph or Bernoulli graph, studied by
Solomonoff and Rapoport in the early 1950s and famously
by Erdős and Rényi a decade later. The random graph,
however, is a poor model for most real-world networks,
as has been argued by many authors, and so other models
have been developed. Recent attention has focused
particularly on generalized random graphs such as the
configuration model and on generative models,
particularly models of growing networks. There is,
however, another class of network models that, while
widely used and valuable, has attracted little attention
in the physics community, namely the class of
"exponential random graphs" or "p-star models." Building
on early statistical work by Besag, exponential random
graphs were first studied in the 1980s by Holland and
Leinhardt [14], and later developed extensively by
Strauss and others. Today, they are commonly used as a
practical tool by statisticians and social network
analysts [17, 18, 19].

Despite their widespread adoption, few analytic results
are known for exponential random graphs: most work has
made use of computer simulation to fit models to
observational data and evaluate model predictions.
Exponential random graphs, however, are ideally suited to
study using the techniques of statistical physics.
Recently, physicists have examined exponential random
graph models of network assortativity [20, 21] and
transitivity [22]. Here we take a different approach and
show how physics techniques can be used to derive
analytically the behavior of one of the most fundamental
of exponential random graph models, the 2-star model. We
view this solution not only as a calculation of interest
in its own right, but also as a demonstration of the way
in which physics techniques can be fruitfully applied to
problems from other fields.



II. THE MODEL 

The exponential random graph is an ensemble model. 
One defines an ensemble consisting of the set of all sim- 
ple undirected graphs with n vertices and no self-edges 
(i.e., networks with either zero or one edge between each 
pair of distinct vertices) and one specifies a probabil- 
ity P(G) for each graph G in this ensemble. Proper- 
ties of the model are calculated as averages over the 
ensemble. Let us define the graph Hamiltonian, also 
referred to by statisticians as a log odds ratio, to be 
H(G) = F - lnP(G). Here F (usually called the free 
energy) is any convenient origin for the measurement of 
the Hamiltonian, such as, for instance, the log of the 
probability of the empty graph (i.e., the probability of 
n vertices with no edges). Then 


$$P(G) = \frac{e^{-H(G)}}{Z}, \qquad Z = \sum_G e^{-H(G)}. \tag{1}$$


Z is the graph partition function and many quantities of 
interest can be calculated from it, or alternatively from 
the free energy. 

So far, this model is entirely general, but progress is
made by assuming the Hamiltonian to be a linear
combination of scalar graph observables, such as number of
edges, degree sequences, or clustering coefficients. In this
paper we study one of the simplest nontrivial cases, the
2-star model, for which $H(G) = \theta_1 m(G) + \theta_2 s(G)$,
where $\theta_1$ and $\theta_2$ are independent parameters,
$m(G)$ is the number of edges in the graph and $s(G)$ is the
number of "2-stars." A 2-star is a pair of edges that share
a common vertex.

Let us denote by $k_i$ the degree of vertex $i$. Then
$m(G) = \frac12 \sum_i k_i$ and $s(G) = \frac12 \sum_i k_i(k_i - 1)$,
and hence we can write the Hamiltonian in the form

$$H = -\frac{J}{n-1} \sum_i k_i^2 - B \sum_i k_i, \tag{2}$$

where the "coupling constant" $J = -\frac12 (n-1)\theta_2$ and the
"field" $B = \frac12 (\theta_2 - \theta_1)$. The factor $(n-1)$ in the
definition of $J$ is not strictly necessary, but it makes the
equations simpler later on.
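
The change of variables from $(\theta_1, \theta_2)$ to $(J, B)$ can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the random test graph are illustrative assumptions, and the code simply evaluates the Hamiltonian in both forms for the same graph.

```python
import random

def two_star_hamiltonians(adj, theta1, theta2):
    """Return H(G) computed from (theta1, theta2) and from (J, B), Eq. (2)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    m = sum(degrees) / 2                        # number of edges
    s = sum(k * (k - 1) for k in degrees) / 2   # number of 2-stars
    h_theta = theta1 * m + theta2 * s
    J = -0.5 * (n - 1) * theta2                 # coupling constant
    B = 0.5 * (theta2 - theta1)                 # field
    h_jb = -J / (n - 1) * sum(k * k for k in degrees) - B * sum(degrees)
    return h_theta, h_jb

def random_graph(n, p, seed=0):
    """Symmetric 0/1 adjacency matrix of a Bernoulli graph (test input only)."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj

h_theta, h_jb = two_star_hamiltonians(random_graph(8, 0.4), theta1=0.7, theta2=-0.3)
print(h_theta, h_jb)  # the two values should agree up to rounding
```

For a three-vertex path (two edges, one 2-star) with $\theta_1 = \theta_2 = 1$, both forms give $H = 3$.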

There are two analytic approaches from statistical
mechanics that can be brought to bear on problems like
this. The first is to use perturbation theory [22] and
the second is to use non-perturbative techniques,
usually based on the Hubbard-Stratonovich transform and
saddle-point expansions [23]. Here we make use of the
latter to solve the 2-star model.



III. ANALYTIC APPROACH 

Our goal is to calculate the partition function $Z$,
Eq. (1), or equivalently the free energy. First, we
introduce auxiliary fields $\phi_i$ on the vertices of the
graph using the Hubbard-Stratonovich relation

$$\exp\!\left(\frac{J k_i^2}{n-1}\right) = \sqrt{\frac{(n-1)J}{\pi}} \int_{-\infty}^{\infty} d\phi_i\, \exp\!\left(-(n-1)J\phi_i^2 + 2J\phi_i k_i\right), \tag{3}$$

which gives

$$Z = \left[\frac{(n-1)J}{\pi}\right]^{n/2} \int \mathcal{D}\phi\, \exp\!\Big(-(n-1)J\sum_i \phi_i^2\Big) \sum_G \exp\!\Big(\sum_i (2J\phi_i + B) k_i\Big), \tag{4}$$

where $\int \mathcal{D}\phi$ indicates the path integral over the fields $\{\phi_i\}$
and we have interchanged the order of the integral and
the sum over graphs $G$.
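
The Hubbard-Stratonovich relation is a Gaussian integral obtained by completing the square, and it can be confirmed numerically for $J > 0$. The sketch below is illustrative only: the function name, the integration window and the trapezoid rule are assumptions made here, not anything specified in the text.

```python
import math

def hs_rhs(k, J, n, half_width=10.0, steps=20_000):
    """Trapezoid-rule estimate of the right-hand side of Eq. (3); needs J > 0."""
    a = (n - 1) * J
    centre = k / (n - 1)                 # the integrand peaks at phi = k/(n-1)
    lo, hi = centre - half_width, centre + half_width
    h = (hi - lo) / steps

    def f(phi):
        return math.exp(-a * phi * phi + 2 * J * phi * k)

    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return math.sqrt(a / math.pi) * total * h

k, J, n = 3, 0.8, 6
print(hs_rhs(k, J, n), math.exp(J * k * k / (n - 1)))  # the two sides of Eq. (3)
```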

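The sum over graphs can now be performed by defining
the symmetric adjacency matrix $\sigma_{ij}$, equal to 1 if there
is an edge between vertices $i$ and $j$ and zero otherwise.
Then, noting that $k_i = \sum_j \sigma_{ij}$, we can write

$$\sum_i (2J\phi_i + B) k_i = \sum_{ij} (2J\phi_i + B)\sigma_{ij} = \sum_{i<j} \left[2J(\phi_i + \phi_j) + 2B\right]\sigma_{ij}. \tag{5}$$

Since $\sigma_{ij}$ is symmetric, its values for $i < j$ completely
define the graph, and hence

$$\sum_G \exp\!\Big(\sum_i (2J\phi_i + B) k_i\Big) = \prod_{i<j} \sum_{\sigma_{ij}=0}^{1} e^{[2J(\phi_i+\phi_j)+2B]\sigma_{ij}} = \prod_{i<j} \left(1 + e^{2J(\phi_i+\phi_j)+2B}\right). \tag{6}$$

Substituting this result into Eq. (4), we then get

$$Z = \int \mathcal{D}\phi\, e^{-\mathcal{H}(\phi)}, \tag{7}$$

where the effective Hamiltonian $\mathcal{H}$ is

$$\mathcal{H}(\phi) = (n-1)J\sum_i \phi_i^2 - \frac12 \sum_{i \ne j} \ln\!\left(1 + e^{2J(\phi_i+\phi_j)+2B}\right) - \frac12 n \ln\big((n-1)J\big). \tag{8}$$

Here an additive constant $\frac12 n \ln \pi$, which is independent of $J$ and $B$, has been dropped.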
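
The factorization of the sum over graphs in Eq. (6) can be verified by brute force for a handful of vertices. This is a hedged sketch: the function names and the test values of $J$, $B$ and $\phi_i$ are arbitrary choices made for illustration, and `a[i]` stands for the per-vertex weight $2J\phi_i + B$.

```python
import itertools
import math

def graph_sum_brute_force(a):
    """Sum over all simple graphs on len(a) vertices of exp(sum_i a[i] * k_i)."""
    n = len(a)
    pairs = list(itertools.combinations(range(n), 2))
    total = 0.0
    for sigma in itertools.product((0, 1), repeat=len(pairs)):
        degrees = [0] * n
        for (i, j), s in zip(pairs, sigma):
            degrees[i] += s
            degrees[j] += s
        total += math.exp(sum(ai * ki for ai, ki in zip(a, degrees)))
    return total

def graph_sum_product(a):
    """Closed form of Eq. (6): product over pairs i < j of (1 + exp(a_i + a_j))."""
    return math.prod(1.0 + math.exp(ai + aj)
                     for ai, aj in itertools.combinations(a, 2))

J, B = 0.7, -0.4
phi = (0.1, 0.5, 0.3, 0.9, 0.6)
a = [2 * J * p + B for p in phi]
print(graph_sum_brute_force(a), graph_sum_product(a))  # 1024 graphs vs. 10 factors
```

The brute-force sum grows as $2^{n(n-1)/2}$ terms, so it is only practical for very small $n$; the product form is what makes the field-theory representation tractable.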



Thus we have transformed our network model into a
field theory of a continuous scalar field on $n$ sites, which
can be solved using a variety of methods. The simplest
mean-field approach is to ignore fluctuations and assume
$\phi_i$ always to be equal to its most probable value, which
occurs at the saddle point

$$\frac{\partial \mathcal{H}}{\partial \phi_i} = 0 = 2(n-1)J\phi_i - J \sum_{j (\ne i)} \left[\tanh\big(J(\phi_i + \phi_j) + B\big) + 1\right]. \tag{9}$$

This has a symmetric solution $\phi_i = \phi_0$ for all $i$ with

$$\phi_0 = \frac12 \left[\tanh(2J\phi_0 + B) + 1\right]. \tag{10}$$
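
Equation (10) has no closed-form solution, but its roots are easy to locate numerically. The sketch below is one possible approach, assumed here for illustration (a sign-change scan on $[0, 1]$ followed by bisection); the function name and grid size are not from the original text.

```python
import math

def mean_field_solutions(J, B, grid=20_000):
    """All roots of phi = (tanh(2*J*phi + B) + 1)/2 on [0, 1], in ascending order."""
    def g(phi):
        return 0.5 * (math.tanh(2 * J * phi + B) + 1) - phi

    roots = []
    xs = [i / grid for i in range(grid + 1)]
    for lo, hi in zip(xs, xs[1:]):
        if g(lo) == 0.0:
            roots.append(lo)                 # root sits exactly on a grid point
        elif g(lo) * g(hi) < 0:
            for _ in range(60):              # bisection inside the bracket
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    if g(1.0) == 0.0:
        roots.append(1.0)
    return roots

print(mean_field_solutions(0.5, -0.5))   # weak coupling
print(mean_field_solutions(1.5, -1.5))   # strong coupling, symmetric line B = -J
```

On the line $B = -J$ the value $\phi_0 = \frac12$ always solves Eq. (10); for the stronger coupling two further roots appear, placed symmetrically about $\frac12$.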

This quantity has a simple physical interpretation.
The mean degree $\langle k \rangle$ of a vertex in the graph is given
by the derivative of the free energy thus:

$$\langle k \rangle = \frac1n \Big\langle \sum_i k_i \Big\rangle = -\frac1n \frac{\partial F}{\partial B} = \frac{1}{2n} \Big\langle \sum_{i \ne j} \left[\tanh\big(J(\phi_i+\phi_j)+B\big) + 1\right] \Big\rangle_\phi, \tag{11}$$

where $\langle \ldots \rangle_\phi$ indicates an average in the $\phi$ ensemble of
Eq. (7). Making the mean-field assumption of Eq. (10),
this becomes $\langle k \rangle = (n-1)\phi_0$ and hence $\phi_0$ is simply
proportional to the mean degree of a vertex, within the
mean-field approximation. The quantity $\langle k \rangle/(n-1)$ is
called the "connectance" of the graph: it is the fraction
of possible edges that are actually present and is a
measure of the mean density. So we could also say that $\phi_0$
is equal to the connectance. This allows us to interpret
Eq. (10) very directly. For $J < 1$, this equation has only
a single solution, but for $J > 1$ we have three coexisting
solutions when $B$ is sufficiently close to $-J$. Only the
outer two solutions are stable, giving us a bifurcation at
$J_c = 1$ corresponding to a continuous phase transition
at this point to a symmetry broken state exhibiting two
phases, one of high density (typically nearly a complete
graph) and one of low density. We show a plot of the
solution of Eq. (10) in the main panel of Fig. 1.

Along the line $B = -J$ the Hamiltonian (2) is symmetric
with respect to the interchange of edges and "holes"
(the absence of edges between vertex pairs). In the inset
to Fig. 1 we show the solution for the connectance as
a function of $J$ along this symmetric line and the plot
shows the bifurcation clearly.
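
As a worked step that the text leaves implicit, the location of the bifurcation can be read off from Eq. (10) on the symmetric line. Writing $\phi_0 = \frac12 + \delta$ (the shift $\delta$ is a shorthand introduced here) and using $\tanh y \simeq y - y^3/3$ for small $y$:

```latex
\begin{align}
\delta &= \tfrac12 \tanh(2J\delta)
  \qquad \bigl(B = -J,\ \phi_0 = \tfrac12 + \delta\bigr), \\
\delta &\simeq J\delta - \tfrac43 J^3 \delta^3
  \quad\Longrightarrow\quad
  \delta = 0 \ \text{ or } \ \delta^2 \simeq \frac{3(J-1)}{4J^3}.
\end{align}
```

Nonzero solutions therefore exist only for $J > 1$ and grow as $(J-1)^{1/2}$ close to the transition, consistent with $J_c = 1$ and with the transition being continuous.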

FIG. 1: The mean-field solution for the connectance $\phi_0 = \langle k \rangle/(n-1)$
in the 2-star model from Eq. (10), for values of
the coupling $J$ below, at, and above the phase transition. For
the case $J = 1.5$ we are in the symmetry broken phase and
the hysteresis loop corresponding to the high- and low-density
phases of the system is clearly visible. Inset: the bifurcation
of the connectance as a function of $J$ along the symmetric line
$B = -J$.

FIG. 2: The variance of vertex degree in the 2-star model as a
function of the coupling $J$ along the symmetric line $B = -J$.
The phase transition is marked by a cusp in the variance, but
no divergence. The solid line represents the analytic solution,
Eq. (17), in the large system size limit, and the points are the
results of Monte Carlo simulations of the model for $n = 1000$.

To move beyond the mean-field result, we make use of
the method of stationary phase. Expanding the effective
Hamiltonian (8) about the mean-field solution to leading
order we have $\mathcal{H} = \mathcal{H}(\phi_0) + \frac12 \phi'^T M \phi' + O(\phi'^3)$, where
$\phi' = \phi - \phi_0$ and $M$ is the Hessian matrix of second
derivatives of $\mathcal{H}$ with respect to $\phi$, evaluated at $\phi_0$.
Changing variables to $\xi = Q\phi'$, where $Q$ is the matrix of
eigenvectors of $M$, $M$ is diagonalized and
$\mathcal{H} = \mathcal{H}(\phi_0) + \frac12 \sum_i \lambda_i \xi_i^2 + O(\xi^3)$,
with $\lambda_i$ being the $i$th eigenvalue of $M$.
Substituting into Eq. (7) and observing that the Jacobian
of the variable change $|Q| = 1$, the path integral becomes
a product of independent Gaussian integrals and
$Z = e^{-\mathcal{H}(\phi_0)}/\sqrt{|M|}$ up to a multiplicative constant
independent of $J$ and $B$, or equivalently
$F = \mathcal{H}(\phi_0) + \frac12 \ln |M|$,
where $|M|$ is the determinant of $M$.

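The elements of the Hessian matrix have the values:

$$M_{ij} = \begin{cases} -4J^2\phi_0(1-\phi_0) & \text{for } i \ne j, \\ (n-1)\left[2J - 4J^2\phi_0(1-\phi_0)\right] & \text{for } i = j, \end{cases} \tag{12}$$

giving

$$|M| = \big(2(n-1)J\big)^n \big(1 - 2J\phi_0(1-\phi_0)\big)^{n-1} \big(1 - 4J\phi_0(1-\phi_0)\big). \tag{13}$$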
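
The determinant in Eq. (13) follows from the simple structure of $M$ (a constant diagonal plus a constant off-diagonal part), which has one eigenvalue for the uniform vector and an $(n-1)$-fold eigenvalue for vectors that sum to zero. The sketch below checks this directly; the function names and the sample values of $n$, $J$ and $\phi_0$ are illustrative assumptions.

```python
import math

def hessian(n, J, phi0):
    """Matrix of Eq. (12) as a list of rows."""
    c = 4 * J * J * phi0 * (1 - phi0)
    diag = (n - 1) * (2 * J - c)
    return [[diag if i == j else -c for j in range(n)] for i in range(n)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def eigenvalues(n, J, phi0):
    """(eigenvalue of the uniform vector, (n-1)-fold eigenvalue of the rest)."""
    x = phi0 * (1 - phi0)
    lam_uniform = 2 * (n - 1) * J * (1 - 4 * J * x)
    lam_rest = 2 * (n - 1) * J - (n - 2) * 4 * J * J * x
    return lam_uniform, lam_rest

def log_det_exact(n, J, phi0):
    lam_u, lam_r = eigenvalues(n, J, phi0)
    return math.log(lam_u) + (n - 1) * math.log(lam_r)

def log_det_eq13(n, J, phi0):
    x = phi0 * (1 - phi0)
    return (n * math.log(2 * (n - 1) * J)
            + (n - 1) * math.log(1 - 2 * J * x)
            + math.log(1 - 4 * J * x))

n, J, phi0 = 50, 0.5, 0.683
print(log_det_exact(n, J, phi0), log_det_eq13(n, J, phi0))
```

The exact $(n-1)$-fold eigenvalue is $2(n-1)J\,[1 - \frac{n-2}{n-1}\,2J\phi_0(1-\phi_0)]$; replacing $\frac{n-2}{n-1}$ by 1 gives the factor that appears in Eq. (13), so the two log-determinants differ by a term of order a constant for large $n$.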

Then, making use of Eqs. (8) and (10), we arrive at the
solution for the free energy

$$F = n(n-1)J\phi_0^2 - \frac12 n(n-1)\ln\!\left(1 + e^{4J\phi_0 + 2B}\right) + \frac12 (n-1)\ln\big(1 - 2J\phi_0(1-\phi_0)\big), \tag{14}$$

where we have kept leading order corrections to the
mean-field result but dropped terms of order a constant
and smaller that vanish in the large $n$ limit.

From the free energy we can calculate expected values
of a variety of properties of the model. For instance the
mean degree $\langle k \rangle$ and the mean squared degree $\langle k^2 \rangle$ are
given by derivatives with respect to $B$ and $J$ and are
equal to

$$\langle k \rangle = (n-1)\phi_0 + \frac{2J\phi_0(1-\phi_0)(1-2\phi_0)}{\big(1-4J\phi_0(1-\phi_0)\big)\big(1-2J\phi_0(1-\phi_0)\big)}, \tag{15}$$

$$\langle k^2 \rangle = (n-1)^2\phi_0^2 + \frac{(n-1)\phi_0(1-\phi_0)(1-4J\phi_0^2)}{\big(1-4J\phi_0(1-\phi_0)\big)\big(1-2J\phi_0(1-\phi_0)\big)}. \tag{16}$$
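
The step from the free energy of Eq. (14) to the mean degree of Eq. (15) can be checked with a finite difference. The sketch assumes $F = -\ln Z$, so that $\langle k \rangle = -\frac1n\,\partial F/\partial B$; the function names, step size and parameter values are illustrative choices, and the fixed-point solver is only valid in the single-solution regime $J < 1$.

```python
import math

def phi0_fixed_point(J, B, iters=500):
    """Solve Eq. (10) by iteration; a contraction for J < 1."""
    phi = 0.5
    for _ in range(iters):
        phi = 0.5 * (math.tanh(2 * J * phi + B) + 1)
    return phi

def free_energy(n, J, B):
    """Eq. (14), evaluated at the mean-field solution phi0(J, B)."""
    p = phi0_fixed_point(J, B)
    return (n * (n - 1) * J * p * p
            - 0.5 * n * (n - 1) * math.log(1 + math.exp(4 * J * p + 2 * B))
            + 0.5 * (n - 1) * math.log(1 - 2 * J * p * (1 - p)))

def mean_degree_eq15(n, J, B):
    p = phi0_fixed_point(J, B)
    x = p * (1 - p)
    return (n - 1) * p + 2 * J * x * (1 - 2 * p) / ((1 - 4 * J * x) * (1 - 2 * J * x))

def mean_degree_numeric(n, J, B, h=1e-5):
    """Central-difference estimate of -(1/n) dF/dB."""
    return -(free_energy(n, J, B + h) - free_energy(n, J, B - h)) / (2 * h * n)

n, J, B = 100, 0.5, -0.3
print(mean_degree_eq15(n, J, B), mean_degree_numeric(n, J, B))
```

The two numbers agree up to a difference of order $1/n$, which is the size of the terms neglected in Eq. (14); for $J = 0$ both reduce to the Bernoulli-graph value $(n-1)e^{2B}/(1+e^{2B})$.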



The leading order term in each case is the same as the
mean-field result, so that in the limit of large $n$ both $\langle k \rangle$
and $\langle k^2 \rangle$ take their mean-field values. The variance of
the degree $\langle k^2 \rangle - \langle k \rangle^2$, on the other hand, is zero within
the mean-field approximation because of the cancellation
of the leading terms but non-zero beyond mean-field:

$$\langle k^2 \rangle - \langle k \rangle^2 = (n-1)\,\frac{\phi_0(1-\phi_0)}{1 - 2J\phi_0(1-\phi_0)}. \tag{17}$$

From consideration of Fig. 1 one might expect this quantity
to diverge at the phase transition, but in fact it does
not, having merely a cusp at that point. In Fig. 2 we
show the form of this function along the symmetric line
$B = -J$ as a function of $J$. The figure also shows the
results of Monte Carlo simulations of the 2-star model
for the same parameter values and, as we can see,
agreement between the simulations and the analytic solution
is excellent.
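
The text does not say which Monte Carlo algorithm was used. As one plausible choice, the sketch below samples the ensemble of Eq. (2) with a single-edge-flip Metropolis rule; the function name, the number of sweeps and the small system size are assumptions made for illustration.

```python
import math
import random

def metropolis_two_star(n, J, B, sweeps=50, seed=0):
    """Single-edge-flip Metropolis sampler for H = -J/(n-1) sum k_i^2 - B sum k_i.

    Starts from the empty graph and returns the degree sequence of the final graph.
    """
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    deg = [0] * n
    steps = sweeps * n * (n - 1) // 2
    for _ in range(steps):
        i = rng.randrange(n)
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1                          # uniform choice of j != i
        s = 1 - 2 * adj[i][j]               # +1: add the edge, -1: remove it
        d_sum_k2 = 2 * s * (deg[i] + deg[j]) + 2   # change in sum_i k_i^2
        delta_h = -J / (n - 1) * d_sum_k2 - 2 * B * s
        if delta_h <= 0 or rng.random() < math.exp(-delta_h):
            adj[i][j] = adj[j][i] = 1 - adj[i][j]
            deg[i] += s
            deg[j] += s
    return deg

deg = metropolis_two_star(30, 0.5, -0.5, sweeps=50, seed=1)
mean_k = sum(deg) / len(deg)
var_k = sum((k - mean_k) ** 2 for k in deg) / len(deg)
print(mean_k, var_k)
```

A single final configuration gives only a noisy estimate; a comparison with Eq. (17) would average over many configurations and use larger $n$. In the symmetry broken phase, single-flip dynamics can remain in one of the two phases for a very long time, so the result there depends on the starting state.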

A divergence does occur in the variance of the number
of edges in the network at the phase transition. This
quantity, which plays the role of a susceptibility for the
model, is given to leading order by

$$\langle m^2 \rangle - \langle m \rangle^2 = -\frac14 \frac{\partial^2 F}{\partial B^2} = \frac12 n(n-1)\,\frac{\phi_0(1-\phi_0)}{1 - 4J\phi_0(1-\phi_0)}. \tag{18}$$

This diverges as $|J - J_c|^{-1}$ as we approach the transition
along the symmetric line $B = -J$.
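
The contrast between the cusp in the degree variance and the divergence of the edge-number variance can be seen by evaluating both expressions along $B = -J$. In this sketch the function names are illustrative, the degree variance is Eq. (17), and the edge variance uses the leading-order form $\frac12 n(n-1)\phi_0(1-\phi_0)/[1-4J\phi_0(1-\phi_0)]$, which follows from Eq. (14) under the assumption $F = -\ln Z$.

```python
import math

def phi0_symmetric(J, iters=2000):
    """High-density mean-field solution on B = -J (equal to 1/2 for J <= 1)."""
    if J <= 1:
        return 0.5
    delta = 0.5
    for _ in range(iters):
        delta = 0.5 * math.tanh(2 * J * delta)   # phi0 = 1/2 + delta
    return 0.5 + delta

def degree_variance(n, J):
    """Eq. (17) on the symmetric line."""
    x = phi0_symmetric(J) * (1 - phi0_symmetric(J))
    return (n - 1) * x / (1 - 2 * J * x)

def edge_variance(n, J):
    """Leading-order variance of the number of edges on the symmetric line."""
    x = phi0_symmetric(J) * (1 - phi0_symmetric(J))
    return 0.5 * n * (n - 1) * x / (1 - 4 * J * x)

n = 1001
for J in (0.5, 0.9, 0.99, 1.5):
    print(J, degree_variance(n, J), edge_variance(n, J))
```

Below the transition $\phi_0 = \frac12$, so the degree variance is $(n-1)/(4-2J)$ and stays finite at $J = 1$, while the edge variance scales as $1/(1-J)$.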

FIG. 3: The phase diagram for the 2-star model. The shaded
region indicates the hysteretic region in which both high- and
low-density phases are possible.

One can also ask whether the network described by
the 2-star model possesses a giant component. Molloy
and Reed [8] have demonstrated that a network without
degree correlations possesses a giant component if and
only if $\langle k^2 \rangle > 2 \langle k \rangle$. We can evaluate this criterion using
Eqs. (15) and (16), and find that for all values of the
system parameters the network possesses a giant component
in the limit of large $n$.
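
The giant-component statement can be made explicit at leading order. Keeping only the dominant terms of Eqs. (15) and (16), and assuming $J$ and $B$ are held fixed as $n$ grows (an assumption about the limit being taken):

```latex
\begin{align}
\langle k^2 \rangle - 2\langle k \rangle
  \simeq (n-1)^2\phi_0^2 - 2(n-1)\phi_0
  = (n-1)\,\phi_0\,\bigl[(n-1)\phi_0 - 2\bigr].
\end{align}
```

Equation (10) gives $\phi_0 > 0$ for any finite $J$ and $B$, so the bracket is positive once $(n-1)\phi_0 > 2$, that is, once the mean degree exceeds 2, which always holds for sufficiently large $n$ at fixed parameters.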

In Fig. 3 we show the phase diagram for the 2-star
model as a function of the parameters $J$ and $B$. The critical
point is at $J = 1$, $B = -1$ and beyond this point there
are high- and low-density phases separated by a phase
coexistence region. In the coexistence region the phase of
the model depends on its history in a manner characteristic
of hysteretic systems. Some studies of exponential
random graphs have considered the case in which the
number of edges in the graph is fixed, a
"conserved-order-parameter" version of the current model [20]. In such a
case, the phase coexistence region will correspond to true
coexistence; low free-energy states of the system will be
states in which the system prefers simultaneously to have
some high-degree "hub" vertices that connect to essentially
all others and some of lower degree, rather than being
uniform everywhere. Such "degenerate" behavior has
been observed since the earliest numerical experiments on
exponential random graphs [14, 15, 16, 23]. Here we see
that this behavior is the precise network analog of the
phase separation phenomenon known to physicists from
many other systems.
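
Within the mean-field picture, the edges of the hysteretic region can be computed in closed form. The sketch assumes that this region is where Eq. (10) has three solutions, so that its boundaries are the points where a pair of solutions merges, $4J\phi_0(1-\phi_0) = 1$; the function name is illustrative.

```python
import math

def hysteresis_window(J):
    """Range of B with three mean-field solutions of Eq. (10), or None if J <= 1.

    Solutions merge where 4*J*phi*(1 - phi) = 1; inverting Eq. (10) then gives
    B = atanh(2*phi - 1) - 2*J*phi at each merging point.
    """
    if J <= 1:
        return None
    s = math.sqrt(1 - 1 / J)
    phi_high, phi_low = 0.5 * (1 + s), 0.5 * (1 - s)

    def field(phi):
        return math.atanh(2 * phi - 1) - 2 * J * phi

    return field(phi_high), field(phi_low)   # (lower edge, upper edge)

for J in (1.1, 1.5, 2.0):
    print(J, hysteresis_window(J))
```

The window is symmetric about $B = -J$ and closes as $J \to 1$, where both edges tend to $B = -1$, in agreement with the critical point quoted in the text.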



IV. CONCLUSIONS 

In this paper, we have given a non-perturbative ana- 
lytic solution of one of the oldest of network models, the 
2-star model, which is perhaps the simplest nontrivial 
model of the class known as exponential random graphs 
and has been long studied in the social sciences. The 
model turns out to be perfectly suited to solution by the 
methods of statistical physics, and among other things 
the solution shows the degenerate behavior of the model 
in certain parameter regimes to be the result of a symme- 
try breaking between high- and low-density phases, which 
are separated from the "normal" region of the model by 
a continuous phase transition. 

The exponential random graphs are, we believe, an im- 
portant class of network models, which have largely been 
neglected despite the high level of interest in networks in 
the last few years. We hope that others will also take up 
the study of these models, either using methods like those 
discussed here or other methods yet to be described.